ABSTRACT


This thesis develops and explores the graphical analysis
of multivariate data sets through the use of a Draftsman
technique of scatter plot displays. These plot displays are
useful for determining associations and relationships
between variables in order to promote an understanding of
the characteristics of the data in exploratory and
descriptive applications. General graphical enhancement
techniques such as jittering and transformations are
discussed and incorporated in the development of a computer
program which produces Draftsman displays. A technical
description of the Draftsman computer program is presented,
and user implementation procedures are discussed. An
analysis is conducted on two varied sets of data to
demonstrate the versatility and utility of the Draftsman
display technique for exploring data structures.
I. INTRODUCTION


A. MOTIVATION


Recent advances in computer hardware and software
capabilities have made powerful diagnostic and analytical
tools for exploring data available to a larger number of
users. These same advances, however, are responsible for a
tremendous increase in the amount of data produced and
available for analysis. Contrary to mathematical intuition,
the availability of more data does not always lead to
greater precision in subsequent analysis. Often the
increased amount of data confounds the analysis by
overburdening our ability to process the information in a
timely and understandable fashion.

Graphical displays are a method of visually portraying
vast amounts of quantitative information. The primary
benefit of graphical techniques is that the human eye-brain
system has a powerful information processing capability. By
properly maximizing our visual capability to process
displayed data, we can rapidly summarize information, focus
on salient features, discern aberrations, and extract items
of interest from a data set.


B. SCOPE


The purpose of a Draftsman display is to use the visual
impact of an array of two dimensional scatter plots to
analyze multivariate data. This can be accomplished by
arranging an exhaustive series of plots consisting of all
paired variables. This enables the analyst to observe the
influence of each variable on every other variable in the
data set.
The concept of using two dimensional scatter plots to
display higher dimensional data structures is discussed in a
text by Chambers et al. [Ref. 1]. The ideas from that
text served as the foundation for the development of an
interactive computer program to construct Draftsman
displays. The additional features of enhancing scatter
plots such as the jittering of discrete data values and
transforming variables can also be applied to this
multidimensional display procedure. The full considerations
that went into the program development as well as user
implementation procedures are amplified in later chapters.

The purpose of this thesis is to integrate the graphical
concepts of scatter plots, jittering, and transforming
variables into a Draftsman display. Although the program is
written in A Programming Language (APL), little if any
knowledge of this language is required to successfully
utilize the program.

This thesis has been written in three major segments in
order to appeal to the widest audience possible. The first
segment, composed of chapters II and III, deals with the
general concepts of graphical methods and user instructions
required to invoke the Draftsman display program. The
second segment, comprised of chapter IV and Appendix A, is
aimed at those readers interested in the technical details
and Draftsman program documentation. The final segment,
found in chapters V and VI, contains a stepwise analysis of
two varied forms of data to demonstrate potential
applications of this procedure in exploratory data analysis.

The graphs used in this paper were produced by an
experimental APL package, GRAFSTAT, which the Naval
Postgraduate School is using under a test agreement with the
IBM Watson Research Center, Yorktown Heights, New York. We
are grateful to Dr. P. D. Welch and Dr. P. Heidelberger for
making GRAFSTAT available to us.
A. DATA DISPLAYS


A variety of graphical methods such as box plots,
histograms, stem and leaf displays and scatter plots are
available to explore relationships which may exist between
variables of a data set. The scatter plot is perhaps one of
the most powerful graphical methods for displaying bivariate
data. The foremost feature of the scatter plot is that all
of the data of interest is readily displayed for visual
interpretation. In addition, the simplicity of
construction, compactness of the display, and adaptability
to other graphical enhancement techniques contribute to the
power of this display.

In contrast, numerical summaries may reflect correlation
but tell little about clustering, patterns, or other
relationships which might be present. This is particularly
true of larger data sets consisting of more than twenty
observations and more than two variables. In these larger
data sets, the sheer volume of data points to be compared
makes interpretation a tedious and time consuming process.

Figure 2.1 is a scatter plot of weight versus engine
displacement for 106 different models of cars produced
in 1983 [Ref. 2: pp. 320-356]. A numerical summary might
readily impart the fact that an increase in car weight is
associated with an increase in engine size. The scatter
plot however rapidly makes apparent some other interesting
features. We can see that the observations consist of two
distinct groupings. For vehicles under 3,000 lbs there is a
strong positive linear dependency between weight and
displacement. For the heavier vehicles over 3,000 lbs,
increasing weight still tends to be correlated with larger
displacements, though more dispersed in form. A numerical
summary would not so easily reveal these relationships.
Figure 2.1 Basic Scatter Plot.


For many applications, our interest may extend beyond
bivariate data sets to larger multidimensional sets. As in
the bivariate case, scatter plots may also be used to
graphically display multivariate data sets. An exhaustive
series of plots consisting of all paired variables performs
a similar function as the single scatter plot does for
bivariate data. By properly aligning the plots so that a
commonality of axis exists between every plot and the
adjacent plots, we can not only observe the relationships
within a specific plot but may also follow particular
observations or groups of observations through the
successive plots to analyze the influence of other
variables. This particular technique of arranging the
scatter plots is similar to a draftsman drawing of a three
dimensional object and hence is termed a Draftsman display.
[Ref. 1: p. 136]

The three dimensional Draftsman display shown in figure
2.2 consists of the variables of weight, turning radius, and
engine displacement for 1983 model cars. The first row
shows the paired plots of weight versus turning radius and
engine displacement. The second row is turning radius
versus weight and engine displacement. The bottom row of
plots consists of engine displacement versus weight and
turning radius. This arrangement of the plots, while
somewhat redundant, allows the viewer to scan across rows or
down columns of plots, thereby matching up points that
correspond to the same observations in different plots.
Figure 2.2 Three Dimensional Draftsman Display.
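
The row-by-row arrangement described for figure 2.2 can be sketched in a few lines of Python. This is only an illustration of the layout logic and is not the APL program developed in this thesis; the function name `draftsman_layout` and the label strings are assumptions made for the example.

```python
def draftsman_layout(names):
    """Return one row of (y_name, x_name) plot pairs per variable.

    Each row holds the scatter plots of one variable (vertical axis)
    against every other variable (horizontal axis), so all plots in a
    row share the same vertical axis.
    """
    layout = []
    for y_name in names:
        row = [(y_name, x_name) for x_name in names if x_name != y_name]
        layout.append(row)
    return layout


# Hypothetical labels matching the three variables of figure 2.2.
variables = ["weight", "turning radius", "engine displacement"]
for row in draftsman_layout(variables):
    print("   ".join(f"{y} vs {x}" for y, x in row))
```

For k variables the sketch yields k rows of k-1 plots each, which makes the redundancy noted in the text explicit: every pair appears twice, once with each variable on the vertical axis.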


Observing the bottom row of plots in figure 2.2, we can
track three distinct groupings of points through all the
paired plots. These three groups correspond to the small,
medium, and large size categories of vehicles. A quick look
at the associations exhibited in this display indicates also
that engine displacement has a tighter relationship with
weight than it does with turning radius. Other relationships
are also evident and are presented in greater detail in the
analysis presented in Chapter V.
B. JITTERING OF VARIABLES


In certain circumstances, a scatter plot may be
visually deceiving due to the overlapping of data points
within the plot. This situation may be particularly
prevalent when one or both of the plotted variables have a
limited range of discrete values. In order to alleviate the
overlapping and enhance the actual relationships that exist,
a small amount of random noise may be added to one or both
of the variables to "jitter" their horizontal and vertical
locations within the plot. The amount of random noise added
or subtracted from the original data values must be
sufficient to prevent overlapping but small enough so that
the original data values can be recovered by rounding to the
nearest whole number. Typically the random noise added is
two to five percent of the total range of the variable
values. [Ref. 1: pp. 106-107]

The visual difference resulting from jittering can be
seen in figure 2.3, where the maintenance records for 1981
versus 1982 were plotted for 106 automobile models.
Maintenance is a category variable with values of 0, 1, 2,
3, 4, and 5. Clearly a problem of overlapping exists in the
basic scatter plot seen on the left. The jittered version on
the right is a more accurate picture of the distribution and
clustering prevalent in the data.


0 = not rated
1 = very poor
2 = poor
3 = average
4 = good
5 = very good
Figure 2.3 Unjittered and Jittered Plots.
C. TRANSFORMATION OF VARIABLES


The primary purpose of employing transformations is to
linearize and simplify the observed relationship between the
variables plotted. In many instances the plots may be
further enhanced through the use of transformations in order
to achieve a simpler and more understandable picture
suitable for visual comparisons and exploration. Hoaglin
[Ref. 3: p. 104] proposes the following pertinent reasons
for employing transformations.

1. Facilitate interpretation in a natural way. 

2. Promote symmetry in a batch. 

3. Promote stable spread in several batches. 

4. Promote straight-line relationships between the
variables.

5. Simplify the structure of a two way or higher
dimensional data structure so that a simple additive model
can assist in the understanding of the characteristics
of the data.

A key factor of transforming the variables is that if
the correct transformation is applied, the resulting scatter
plot will appear more linear in form. This in turn visually
enhances recognition and detection of deviations and
outliers, and assists in observing relationships or patterns.

As previously discussed, the basic scatter plot of
weight and engine displacement is divided into two distinct
groups of points as seen in the left plot of figure 2.4.
While the lower group appears fairly linear, the upper group
is more dispersed and curved in shape. The plot on the
right shows the effect of applying a log transform to the
engine displacement values. The resulting plot of the
transformed data becomes more linear over the entire range
of engine displacement values (see figure 2.4).
Figure 2.4 An Example of Transformation.
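
The linearizing effect of a log transform can be checked numerically. The Python sketch below is an illustration only: the weights and displacements are synthetic values generated from an assumed exponential relationship, not the Consumers Union data, and the helper `pearson` is a hypothetical name for the ordinary correlation coefficient.

```python
import math


def pearson(xs, ys):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic, curved relationship (assumed for illustration only).
weights = [1800 + 200 * i for i in range(12)]
displacement = [60 * math.exp(0.0006 * w) for w in weights]

r_raw = pearson(weights, displacement)
r_log = pearson(weights, [math.log(d) for d in displacement])
print(f"correlation before log: {r_raw:.4f}, after log: {r_log:.4f}")
```

Because the synthetic displacement is exactly exponential in weight, its logarithm is exactly linear in weight, so the correlation after the transform is essentially one. Real data would show a smaller but similar improvement when the log is the appropriate transformation.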


A note of caution is appropriate in determining when to
use transformations in the Draftsman display. Since
transformations result in a change to the displayed values
and scale, care must be taken to avoid confusion during
subsequent analysis. We should ensure that the benefit of
describing the data with a transformation is greater than
the loss of simplicity incurred through its use.
III. USER INSTRUCTIONS


A. GENERAL GUIDANCE


The Draftsman program was written in APL and is designed
to be used in conjunction with the experimental IBM graphics
software GRAFSTAT. The Draftsman program is interactive and
requires little knowledge of APL to use. The APL versed
user can easily modify the basic program and called
subroutines for more specialized forms of analysis.

The graphical software which generates the Draftsman
displays requires the use of either the IBM 3277GA or
3278/79 graphic display terminals [Ref. 4]. Normally these
terminals are available as public facilities with special
accounts and passwords. Once logged on to one of these
terminals the user may link back to their own account and
copy any of their own files as desired. This is useful in
retrieving data files which the user wishes to analyze with
the Draftsman display. [Ref. 5]


B. USER REQUIREMENTS


This section will provide a brief overview of the user
inputs required to generate a Draftsman display of a data
set. An explicit step-by-step description of all input
requirements is found at the conclusion of this chapter in
figures 3.2 through 3.6.

Since the Draftsman program is written in APL, the user
will have to enter the APL sub-environment in order to gain
access to the graphics programs. Once in the APL
environment, the APL character set is invoked by keying the
APL ON key. These APL characters are found in red and
supersede the normal keys. It is recommended that the first
time user take a few minutes to familiarize themselves with
the locations of these characters.

The APL environment will allow the user to copy and
retrieve both the GRAFSTAT and Draftsman programs as shown
in figure 3.2. Once these programs are in the workspace
the basic set up procedure is complete and the user is ready
to actually initiate the Draftsman program to produce a
display. The Draftsman program is initiated by typing
DRAFTSMAN followed by return. The program will respond with
a series of terminal queries requesting the various input
parameters required in the display. Each query is generated
based upon the user response to the previous query. The
general program schematic and input requirements are
outlined in figure 3.1.

Figure 3.1 Draftsman Program Schematic.
The first query presented is that of inputting a data
set (figure 3.3). Data which has been previously copied
from another APL workspace may be entered by variable name.
Data which is located on a CMS file can be automatically
read into the workspace. Data in a CMS file can contain
only numeric characters. A mixture of numeric and
alphabetic characters will result in the data not being read
in correctly. A crucial requirement is that regardless of
how the data is entered it must be in two dimensional array
form (rows and columns). The columns of the data correspond
to the variables, the rows to the different observations on
each variable.

Once the basic data has been entered the user is 
presented with an option to have either all of the data or 
only a subsample of the data appear in the display. This 
allows Draftsman displays to be produced on either all the 
data, specified variables, a subpopulation of a variable, or 
any combination thereof (figure 3.4).

Based on the data selected, an option will be presented
to enter the appropriate names of the variables which will
appear in the display (figure 3.5). These names are the
labels which will appear on the axes of the plots. The
variable names can be entered as a previously generated APL
two dimensional array of characters. If this method of
input is selected, each row of the array must contain the
name of a variable in the same order as the variable is
located in the data structure. The variable names may also
be entered directly in response to a sequential series of
queries. Once the variable names are entered, the minimum
input requirements needed to produce a Draftsman display
have been completed. The remainder of the queries pertain
to display enhancements which may be invoked if desired.

The first enhancement option is that of jittering
(figure 3.6). An input of 0 will result in no jittering of
the data. If jittering is desired, the user will be queried
as to which variables. The results of jittering appear only
in the Draftsman display and do not permanently alter any of
the values in the original data set.

The second enhancement option available is
transformation of variables (figure 3.6). Here again, a
response of 0 will result in no transformations occurring.
If one or more transformations are desired the user is
prompted for the variables and the APL expression for the
transformation. A summary of some of the more common
transformations with examples is illustrated in Table I.


TABLE I

Sample APL Transformations

TRANSFORM      MATH FORM     APL EXPRESSION

LOG            ln X          ⍟X
LINEAR         AX + B        B+A×X
CUBIC          X^3           X*3
CUBE ROOT      X^(1/3)       X*(1÷3)
SQUARE         X^2           X*2
SQUARE ROOT    X^(1/2)       X*(1÷2)
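
For readers unfamiliar with APL notation, the transformations of Table I can be restated in Python. The sketch below is an aid to reading the table and is not part of the Draftsman program; the names `TRANSFORMS` and `apply_transform` and the default linear coefficients are assumptions chosen for the example, and the root and log entries presume positive data values.

```python
import math

# Python counterparts of the Table I entries (illustrative names).
TRANSFORMS = {
    "log": lambda x: math.log(x),                    # natural log, x > 0
    "linear": lambda x, a=2.0, b=1.0: a * x + b,     # a, b are example values
    "cubic": lambda x: x ** 3,
    "cube root": lambda x: x ** (1.0 / 3.0),         # intended for x >= 0
    "square": lambda x: x ** 2,
    "square root": lambda x: x ** 0.5,               # intended for x >= 0
}


def apply_transform(name, column):
    """Return a transformed copy of one data column; the input is untouched."""
    func = TRANSFORMS[name]
    return [func(value) for value in column]


engine = [98.0, 151.0, 231.0, 350.0]   # hypothetical displacements
print(apply_transform("log", engine))
```

Returning a new list rather than modifying the column in place mirrors the behaviour described for the Draftsman program, in which only the plotted values change and the original data set is left intact.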


The Draftsman program will begin to display the
component scatter plots on the graphics screen. The entire
display is generated in segments of five variables. At the
end of each segment an option is offered for the user to
quit, continue, or to make a hardcopy and continue.
IV. DRAFTSMAN TECHNICAL IMPLEMENTATION


A. BASIC DRAFTSMAN ROUTINE


The Draftsman program was written in APL, which is an
array processing language. The use of APL enables the
Draftsman program to call and implement a variety of
plotting functions available in the IBM GRAFSTAT graphical
software package. The GRAFSTAT software is an experimental
graphics program currently under development by IBM. It is
presently available at the Naval Postgraduate School for
testing and evaluation purposes. [Ref. 4]

A secondary benefit derived in using APL is its inherent
efficiency for the user: a large number of mathematical
operations can be executed directly as keyboard entries.
This approach is ideal for exploring data structures and
features of interest. [Ref. 6]

The foundation of the Draftsman program revolves around
the graphical plotting features of GRAFSTAT, and in
particular the scatter plot option. This option requires
the user to input the two variables of interest, size and
location of the plots, as well as any headings desired.
Figure 4.1 is an example of the scatter plot input screen,
called the alphanumeric screen (ANS) in GRAFSTAT.

Correct alignment of the series of scatter plots so that
a commonality of axis between adjacent plots exists involves
an automatic reiterative calling of the scatter plot
graphics function. Each input parameter of the basic
scatter plot function is assigned as a local program
variable. Based upon the input data structure, the
Draftsman program systematically selects the variables to be
plotted, appropriately labels each axis, and determines the
correct location in which each of the plots will appear for
display. This methodology produces the entire array of
plots while eliminating the need for reiterative inputs by
the user for each plot. The output is displayed as a row of
scatter plots for each variable in the data set.


PLOT FUNCTION

X VARIABLE(S)              :
Y VARIABLE(S)              :
TYPE(S) OF PLOT            :
TYPE(S) OF LINE            :
TYPE(S) OF SYMBOL          :
PLOT HEADER (IN QUOTES)    :
SCREEN HEADER (IN QUOTES)  : ''
Y-AXIS LABEL (IN QUOTES)   : YAXIS
POSITION                   : POSN
SCALE X-AXIS               : LIN LX TX
SCALE Y-AXIS               : LIN LY TY
PARTIAL PLOT               :
AXES AND GRID CONTROL      :

Figure 4.1 GRAFSTAT Scatter Plot Function Screen.

For data structures consisting of five or fewer
variables, the Draftsman program display will fit on a
single page. The plotting of five variables per page was
selected to balance space limitations against the need for
sufficient clarity of detail within the plots. To
accommodate more than five variables on a single page would
require smaller plots while reducing the visual usefulness
of the display and making comparisons inconvenient. Fewer
than five variables per page results in the excessive use of
costly graphic reproducing paper. For data sets exceeding
five variables, the Draftsman display is generated in
segments which when reproduced may be pasted together to
form a completed display.




The segmental method of producing Draftsman displays
enables the user to display data sets consisting of more
than five variables. The display procedure is limited only
by the workspace capacity of the user computing facility.
The number of segments that will result in a Draftsman
display can be calculated by squaring the number of
variables in the data set and dividing by 25. In practice, a
display of more than 15 variables becomes somewhat unwieldy
and may negate the benefits of using a Draftsman
methodology.
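
The segment count can be worked through in a short Python sketch. The rule stated above (square the number of variables and divide by 25) gives a whole number only when the variable count is a multiple of five; the sketch therefore assumes, as an interpretation not confirmed by the text, that variables are taken in blocks of five and that a partial block still occupies a full segment. The function name `segment_count` is hypothetical.

```python
import math


def segment_count(n_variables, per_segment=5):
    """Number of display segments for a Draftsman display.

    Assumes the variables are split into blocks of `per_segment`
    along both the rows and the columns of the display, with any
    partial block rounded up to a full segment.
    """
    blocks = math.ceil(n_variables / per_segment)
    return blocks * blocks


for n in (5, 10, 15):
    # For multiples of five this agrees with n squared divided by 25.
    print(n, segment_count(n), n * n // 25)
```

Under this reading a 15 variable data set already requires nine pages to be pasted together, which is consistent with the remark that larger displays become unwieldy.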


B. TECHNICAL DETAILS OF INPUT REQUIREMENTS


A two dimensional array of data and a two dimensional
array of the data variable names are the minimum input
parameters required to generate a Draftsman display. These
parameters are input as prompted by the routine ADMIN.

Data may be input directly as an APL variable or
retrieved from a CMS file located on the user's disk. File
reading is accomplished by CMSREAD [Ref. 6], a library
routine which has been pre-copied into the Draftsman
workspace.

A program entitled SUB was written to assist in the
restructuring of data sets into more convenient formats. An
initial analysis of the basic Draftsman display may reveal
certain variables or sections of data points which warrant
closer scrutiny. The SUB program allows the user to select
variables from the original data set as well as subsamples
of a variable in order to create a new data set entitled
DATA. DATA becomes the global variable that is actually
displayed. The APL program SUB which implements this
procedure is found in Appendix A.
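
The kind of restructuring performed by SUB can be illustrated with a brief Python sketch. It is not a translation of the APL routine in Appendix A; the function name `sub`, its arguments, and the sample matrix are assumptions made to show the idea of selecting columns (variables) and filtering rows (a subsample) to form a new data matrix.

```python
def sub(data, columns, keep_row=None):
    """Build a new data matrix from selected columns and rows.

    data     : list of rows, each row a list of numbers
    columns  : indices of the variables to keep, in display order
    keep_row : optional predicate on a full row; rows failing it are dropped
    """
    rows = data if keep_row is None else [row for row in data if keep_row(row)]
    return [[row[c] for c in columns] for row in rows]


# Hypothetical matrix: columns are price ($1000), weight (lbs), length (in).
original = [
    [6.5, 2100, 165],
    [9.8, 2900, 185],
    [14.2, 3600, 205],
]

# Keep weight and length only, for vehicles weighing under 3,000 lbs.
DATA = sub(original, columns=[1, 2], keep_row=lambda row: row[1] < 3000)
print(DATA)
```

The original matrix is never modified, so a different selection can be drawn from it for a subsequent display.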

The matrix of variable names is either input directly as
an APL two dimensional array or is generated by the routine
LABELS. Each row of the matrix corresponds to each of the
variables in the data set which is to be displayed. When
generated by LABELS, each variable name entered containing
less than 20 characters is padded with blank spaces. If
more than twenty characters are entered, only the first
twenty characters will appear on the display. This assures
that the entire array when passed to succeeding routines is
a valid rectangular character array. The LABELS routine
which implements this procedure is found in Appendix A.
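
The padding and truncation rule can be expressed compactly. The Python sketch below only illustrates the twenty character rule described above and is not the LABELS routine of Appendix A; the function name `make_labels` is an assumption.

```python
def make_labels(names, width=20):
    """Return the names as equal-length strings.

    Names shorter than `width` are padded on the right with blanks;
    longer names are cut to their first `width` characters, so the
    result can be treated as a rectangular character array.
    """
    return [name[:width].ljust(width) for name in names]


labels = make_labels(["WEIGHT", "ENGINE DISPLACEMENT (CU IN)"])
for label in labels:
    print(repr(label), len(label))
```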


C. ENHANCEMENT ROUTINES 


1. Jitter Routine 




As discussed in Chapter II, overlapping of plotted
values may be misleading and inadequately portray the visual
relationships exhibited in the data structure. The solution
is to jitter or add random noise to one or both variables to
be plotted. This technique is presented as an enhancement
option to the user and requires only an identification of
the variables upon which jittering will be performed.

The jittering of the variable points within the
Draftsman program is accomplished through a method discussed
in Chambers [Ref. 1: pp. 106-107]. We let U_i, for i = 1 to
n (the number of observations), be n equally spaced values
from -1 to +1 in random order. The original variable values
X_i are thus reexpressed in jittered form J_i,


    J_i = X_i + \theta_x U_i                    (eqn 4.1)


where \theta_x is .05 times the range of the variable data
values. This method results in a fractional shift of the
data values along the same axis in which the variable is
plotted.
The small shifting of plotted points is sufficient
to negate the effects of overlaps while preventing any
serious corruption of the plotted data. It shows the
multiplicity of points at each actual coordinate. The
original data values can be recovered by rounding to the
nearest integer. Internally the Draftsman program uses the
original data to create local variables which are jittered
and then plotted. This enables the user to always maintain
the data in its original form.
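
Equation 4.1 can be sketched in Python as follows. This is an illustrative rendering of the method as described in the text, not the APL jitter routine itself; the function name `jitter`, the optional `rng` argument, and the handling of empty or single-value input are assumptions.

```python
import random


def jitter(values, fraction=0.05, rng=None):
    """Return jittered copies J_i = X_i + theta * U_i of the values.

    U_i are n equally spaced numbers from -1 to +1 taken in random
    order, and theta is `fraction` times the range of the values.
    The input list is left unchanged.
    """
    rng = rng or random.Random()
    n = len(values)
    if n == 0:
        return []
    if n == 1:
        u = [0.0]  # assumption: a single point is not shifted
    else:
        u = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    rng.shuffle(u)
    theta = fraction * (max(values) - min(values))
    return [x + theta * u_i for x, u_i in zip(values, u)]


# Hypothetical maintenance ratings on the 0 to 5 scale.
ratings = [0, 1, 2, 3, 4, 5, 3, 3, 4]
jittered = jitter(ratings, rng=random.Random(0))
print([round(v, 3) for v in jittered])
```

For ratings spanning 0 to 5 the largest shift is 0.05 times 5, or 0.25, which is below one half, so rounding each jittered value to the nearest integer returns the original rating as the text states.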


2. Transformation Routine


The potential for transforming variables was written
as a user option to further enhance the basic Draftsman
display. This routine maximizes the characteristics of the
APL primitive functions as well as parallel array
processing. The program requires the identification of the
variables desired to be transformed and the appropriate APL
expressions for each transformation. The parallel
processing capability transforms each variable set in one
operation. As in the jitter routine, the transformation
routine transforms only local variables for plotting and
leaves the original data structure intact.
V. AN ANALYSIS OF AUTOMOBILE DATA


A. INTRODUCTION


An analysis is presented of data consisting of selected
characteristics of automobiles manufactured during 1983 and
tested by Consumers Union. The primary purpose of this
chapter is to demonstrate an application of Draftsman
displays in exploratory data analysis. The analysis
initially explores the general descriptive qualities of the
characteristics of automobiles using the basic Draftsman
display procedures. Subsequent analysis focuses on observed
variables of interest as developed through the enhancement
features of Draftsman.


B. THE AUTOMOBILE DATA


The data was initially formatted as a two dimensional
array consisting of 106 rows and 14 columns. Each row of
the data matrix corresponds to one of the 106 different
models of automobiles as tested by Consumers Union [Ref. 2:
pp. 320-356]. The columns contain various characteristics
for each of the automobiles. These fourteen variables
comprise the three general categories of price, performance,
and size. The price category consists of the suggested
retail price of the basic automobile without additional
options. The performance variables include fuel efficiency
(city and highway), turning diameter, gear ratio, and
vehicle repair records for the two preceding years. The
size variables consist of length, weight, headroom, rear
seating space, trunk size, and engine displacement. A
general variable, automobile, corresponds to each of the
specific models upon which the data is based. A summarized
description of the data is shown in Table II for reference.
TABLE II

Automobile Data Characteristics

VARIABLE               UNITS               REMARKS

Automobile                                 1 to 48: small cars
                                           49 to 98: mid cars
                                           99 to 106: large cars
Price                  $1000
MPG City               Miles per gallon    EPA rated
MPG Highway
Repair 81                                  0 = not rated
Repair 82                                  1 = very poor
                                           2 = poor
                                           3 = average
                                           4 = good
                                           5 = very good
Headroom               Inches
Rear seat space        Inches
Trunk size             Cubic feet
Weight                 Pounds
Length                 Inches
Turning radius         Feet
Engine displacement    Cubic inches
Gear ratio


C. PRELIMINARY ANALYSIS 
1. General 


The general Draftsman display of variables was
generated as discussed in chapter II. A reduced version of
the basic displays is seen in figures 5.1 through 5.4. The
actual Draftsman displays used for analysis may be found in
Appendix B. For convenience and clarity, individual scatter
plots from the displays will be included within applicable
sections of the text.
Figure 5.4 Segment 4 of Automobile Data.


The general characteristics of automobiles are
likely to be familiar to most readers. Intuitively we can
perhaps surmise some of the relationships of the data
structure without even looking at it. This familiarity
however will enable us to concentrate more on the features
of the Draftsman program in exploring the data.
Additionally, we may confirm intuitive knowledge or perhaps
change some of our perspectives based upon the analysis.


Focusing on price as relating to the other parameters,
the first visual message imparted in figure 5.5 is
that price bands delineate the major categories of
automobiles. Generally the small sized cars are grouped at
under $10,000 while midsize models are rather tightly
grouped between $7,500 and $12,000. If we concentrate on
deviations from this pattern, the outliers reveal an
interesting feature. From each major size category to the
next there is a substantial increase in the number of
outliers within the categories. These outliers are
predominantly luxury models within their respective
categories.

When price and weight are compared, a gentle upward
sloping trend dominates, denoting that price and weight are
positively related, which is to be expected (figure 5.5).
This relationship levels off at about $10,000. A very
obvious branch from the main trunk of observations shows
price increasing relative to weight at a greater rate. The
presence of this branch suggested additional research to
determine if a significant parameter was missing from the
data. The research revealed this uppermost branch consisted
of luxury style models, with all but one of foreign
manufacture. The majority of the outliers contained between
the two branches are the luxury models of American origin.
We might conclude that weight is generally associated with
an increase in price, with the foreign luxury models tending
to be more expensive than American luxury models of
comparable weight.

Figure 5.5 Characteristics of Price.

Upward curving relationships similar to that
observed in the scatter plot of price versus weight can be
seen in some of the other plots in the price row in figure
5.5. The plot of price and weight is most closely
resembled by that of price and length. This however may be
somewhat deceiving. A little thought might lead us to
conclude that these similarities are more due to a
relationship between length and weight. Consistent with
this are the two plots containing the parameters of rear
seating space and trunk size versus price. Although they
loosely resemble the pattern of the price versus weight
plot, we should suspect that they are influenced more by the
overall dimension of automobile length. These plots
demonstrate the care that must be taken in the analysis of
single scatter plots. We must be cautious since each
scatter plot in the array denotes only the isolated
relationship of two variables and may not necessarily
indicate a causal relationship.
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ifeorger tc elininate Sicmeovernlapping of -the 
discrete valued maintenance ratings, jittering was required. 
If we look at price with respect to the two preceeding years 
of maintainability in figure 5.5, we observe that the vast 
Megonity of the vehicles rated cost less than $13,000. 
Vehicles on which no maintenance records were available are 
denoted Ly a 0. Significant of the rated vehicles is that 
they are very evenly distributed across all levels of nmain- 
memance Scores. This suggests that price alone does not 


insure the maintainability of an automobile. 
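
The jittering step mentioned above can be illustrated with a
minimal sketch. It is not the thesis program itself: the
function name `jitter`, the use of uniform noise, the
half-width of 0.2, and the sample ratings are all assumptions
chosen for illustration. The idea is only that a small random
offset separates points that would otherwise be plotted on
top of one another.

```python
import random


def jitter(values, amount=0.2, seed=0):
    """Return a copy of `values` with uniform noise in [-amount, +amount] added.

    `amount` and `seed` are illustrative defaults, not values from the thesis.
    """
    rng = random.Random(seed)  # fixed seed so the display is reproducible
    return [v + rng.uniform(-amount, amount) for v in values]


# Hypothetical repair ratings (0 = no record available); not the thesis data.
ratings = [3, 3, 3, 4, 4, 2, 0, 3, 5, 4]
jittered = jitter(ratings)

for raw, moved in zip(ratings, jittered):
    print(raw, round(moved, 3))
```

The offset should stay well below half the spacing between
rating levels so that neighbouring levels remain visually
distinct after jittering.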


3. Characteristics of Size


As an overall expression of size, weight and length
show a fairly tight linear relationship as depicted in
figure 5.6. Comparing these two variables across their
respective rows indicates that they both appear to manifest
similar relationships with the other parameters. With
respect to the size variables of headroom, rear seating, and
trunk space, tighter relationships are seen with length.
Weight on the other hand has a tighter relationship with the
engine displacement parameter (see figure 5.6).

Rear seating and trunk size are both generally
increasing relative to length. This observation is what we
might expect since a longer external size could reasonably
result in larger internal size features. Unusual however
is the factor of headspace, which deviates from the general
trend of the other internal size dimensions. As vehicle
length increases in figure 5.6, there is a propensity for
headspace to encompass a widening range of values. The
broadest range is at 106 inches where headspace varies from
1 to 4 inches. As vehicles become even larger, the headspace
sizes shift to the larger values while in general being
limited in range from 3 to 5 inches.

Figure 5.6 Size and Internal Dimensions.


The tendency for increased car weight to be associated
with a related increase in engine displacement is
another observation we would expect, and is seen in figure
5.7. Of significance is that displacement versus weight
falls into two distinct types of groups. Vehicles weighing
up to 3200 pounds have a very tight increasing linear
relationship with engine displacements up to 175 cubic
inches. The vehicles of larger weight capacity are seen to
be associated with larger engine displacements, albeit with
a more dispersed cluster of points.

Engine displacement in turn can be seen to have a
definite correspondence to the overall automobile
categories. A close look at the second plot in figure 5.7
reveals that small cars tend to be banded with engine
displacement from 50 to 50 cubic inches. Noticeable is the
one small car outlier, identified as the AMC Spirit 6 with a
much larger displacement of 258 cubic inches. Medium category
cars are fairly evenly distributed in two bands of engine
displacement. The lower band spans displacements of 125 to
160 cubic inches while the upper band is tightly spanned
from 220 to 260 cubic inches in displacement.

[Figure 5.7: two plots with ENG DISPLACEMENT axes]

The outliers in the medium car class with significantly
larger engine displacements were identified as the
Chrysler Cordoba V6, Chrysler Cordoba V8, and Lincoln
Continental V8. Almost all of the larger cars are clustered
at the 300 cubic inch displacement level with two
exceptions. The Buick Electra V6 and Buick LeSabre V6, with
displacements of 252 and 231 cubic inches respectively, have
lower displacements. Notwithstanding the outliers, vehicle
class and engine displacement are highly correlated.
Overall, the deviations of the outliers in figure 5.7 have
an interesting property. They are all of American
manufacture and deviate either up or down one engine
displacement group. These traits suggest that these vehicles
may have previously been in a different size class and
changes in some other characteristic features resulted in
their being moved up or down an automobile class.




4. Vehicle performance 


Consumers should have a particular interest in the
fuel efficiency characteristics of automobiles. Not
surprising is the trend shown in figure 5.8 that correlates
fuel efficiency in the city with fuel efficiency on the
highway. A comparison of the remaining variable plots for
these efficiency parameters shows identical relationships in
all cases. The original data structure could probably
exclude one of these fuel efficiency variables without loss
of information if we needed to condense the data.
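
The redundancy argument can be checked numerically as well as
visually. The sketch below computes a Pearson correlation
coefficient for two mileage columns. The function name and
the mileage figures are hypothetical, not the thesis data,
and a coefficient near 1 is offered only as one possible
criterion for dropping one of a pair of variables.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical city and highway mileage (mpg) for six cars.
city = [12, 15, 18, 21, 24, 28]
highway = [18, 22, 25, 29, 33, 38]

r = pearson(city, highway)
print(round(r, 4))
```

A coefficient this close to 1 would mean that either column
carries nearly the same ordering information as the other,
which is the situation the scatter plot suggests.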
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Figure 5.8 Fuel Efficiency, Weight, and Displacement. 


The inverse relationship of fuel efficiency to
vehicle weight should not be unexpected and confirms our
intuition in this regard (figure 5.8). High fuel efficiency,
low weight, and smaller engine displacements are all
associated.

As previously mentioned, price alone is not an indicator
of automobile maintainability. An interesting observation
however can be drawn from the relationship exhibited
between repair records from one year to the next. The plot
of Repair 81 versus Repair 82 in figure 5.9 indicates a
strong positive correlation between the two. In almost all
instances the maintainability does not change for better or
worse by more than one level. Furthermore, the numbers of
automobiles that improved, deteriorated, or did not change
in terms of maintainability are approximately equal.





Figure 5.9 Maintainability of Automobiles. 


Repair when compared to engine displacement reveals
that the predominant number of rated vehicles (i.e., vehicles
for which repair data was available) contained the smaller
displacement engines. This concentration of better
maintenance values at the lower displacement level suggests
that smaller engines have better maintenance records.


D. ANALYSIS WITH ENHANCED DISPLAY


1. General 


The analysis of the basic draftsman display revealed
a wealth of features pertaining to the individual variables
within the data. One distinct feature evident is the
potential relationship between foreign and American
manufactured automobiles. While not an original parameter of
the data, the plots of price versus weight and price versus
model indicate that this influence warrants closer scrutiny.

In the basic display the overlapping of maintenance
values was alleviated with jittering. The array of plots
dealing with headroom space suggests that this variable also
should be treated likewise. The remaining automobile
characteristics do not suffer from any significant problems
of overlapping values.

Based upon the preliminary analysis, an enhanced
display was generated for subsequent evaluation. The
redundancy of the two fuel efficiency variables was resolved
by eliminating miles per gallon on the highway. The
remaining variables were reordered to place those with
similar relationships in closer proximity. The enhanced
display also introduces a new discrete category variable,
location of manufacture. A value of 1 for this variable
corresponds to those automobiles produced in America. Those
vehicles produced overseas under an American brand name are
denoted with a 2, while foreign models are assigned a 3.

The introduction of location of manufacture dramatically
portrays some very evident dichotomies which exist
between foreign and American made automobiles. In general,
the array of plots consisting of these parameters indicates
a very different orientation on the part of the respective
manufacturers in their approach to the automobile market.

The potential for reexpressing the data through
transformations was considered. Transforming engine
displacement with a log transform slightly straightens the
plots containing this variable with respect to some of the
other size parameters as seen in figure 5.10. This
reexpression however does not really enhance the description
of the data and hence was not included in the final display.

The complete enhanced draftsman display may be found
in Appendix B. Isolated portions of this display will be
reproduced within this section of the text for clarity.

Figure 5.10 Log Transformation of Engine Displacement.


2. Price


The majority of American and foreign cars are primarily
priced below $12,000. Extremely visible is the large
massing of American models in the neighborhood of $9000 to
[illegible] as depicted in figure 5.11. American model prices
beyond this level tend to increase in a rather uniform
fashion of $2000 increments, up to $24000. In contrast,
foreign car prices are fairly uniformly distributed in the
region of $5000 to $15000 with subsequent price hikes in
larger increments of $5000, to the maximum level of $35000.

Figure 5.11 Location of Manufacture and Price.


3. Size


In general, automobiles of American and foreign
origin fall within two distinct size ranges (figure 5.12).
In terms of the major dimensions of length and weight, the
plot arrays provide some distinguishing features for the
location of manufacture. Not very surprising is that
American vehicles tend to the longer and heavier side while
foreign manufactured cars tend to be shorter and lighter.
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Figure 5.12 Location of Manufacture and Size. 


In view of the propensity for foreign cars to be
shorter than their American counterparts, the plots of the
related inner size characteristics shown in figure 5.12 are
somewhat unexpected. An evaluation of rear seating space,
trunk size, and headroom shows that the distribution of
values for foreign produced vehicles is slightly shifted to
the smaller dimensions in contrast to the respective
American distributed values. This is consistent with
observations noted in the basic display. What is unexpected
is that the difference based upon the shifts is much smaller
than we might expect given the prevalent difference in
length distributions between American and foreign cars.

In conclusion, the rear seat spaciousness of
American cars is fairly evenly distributed between 25 and 30
inches with the large models being outliers at 32 inches.
The foreign models are more widely dispersed from 20 to 29
inches. Thus at the upper end of the spaciousness scale
there is actually only a three inch advantage by the largest
of the American models.

The characteristic of headroom denotes a similar
relationship to that observed for rear seating space (figure
5.12). Again, approximately 50% of the foreign car headroom
values fall within the same distribution range (2 1/2 to 4
1/4 inches) as that of the vast majority of the American
models.

The differences between trunk sizes seen in figure
5.12 are a bit more acute in that the foreign models range
from 10 to 14 cubic feet, while the American cars are skewed
toward 12 to 14 cubic feet. Clearly, in spite of the
distinct length differences between foreign and American
produced cars, the differences in internal dimensions are
much subtler and smaller than we might have originally
suspected. The foreign cars, although smaller in overall
length, have approximately the same internal size features
as all but the very largest American made cars. It is also
interesting to note that the American sponsored but overseas
produced models tend to exhibit the characteristics of the
foreign models.


4. Performance


The distribution of the fuel efficiency characteristics
of American and foreign automobiles appears to be the
inverse of their weight (figure 5.13). The heavier
American cars tend to be evenly distributed between 9 and 20
mpg with only three outliers extending beyond 25 mpg. The
lighter foreign cars, while ranging from 15 to 28 mpg, are
rather tightly grouped between 20 and 25 mpg. The outlier
in this case is at the extreme range of 33 mpg. In terms of
fuel efficiency, the foreign vehicles certainly dominate
this attribute of performance.

Perhaps the most revealing plots in the enhanced
display are those of the repair records. In both recorded
years the American models are rather evenly distributed from
poorer than average (1) to average (3) maintainability
ratings. A better than average rating (4) was achieved only
four times over both years. In extreme contrast, the
foreign models during both years show a tendency toward the
much better than average maintainability rating (5). As in
the characteristic of fuel efficiency, the foreign models
dominate this performance variable of maintainability.

Figure 5.13 Location of Manufacture and Performance.


E. CONCLUSIONS


The car data is an excellent example of how a
Draftsman display can be used to describe data. The
various parameters associated with automobiles can be very
confusing to the consumer. No one single parameter can be
selected as an overall measure of what constitutes a "best"
automobile. What one consumer may find desirable, another
consumer may find unacceptable. Thus, describing or
modeling the data with more formal statistical techniques
such as linear regression is not very applicable. The
Draftsman display enables the user to observe the
multivariate effects of each of the various parameters.
Based upon the selection of one or more parameters, the user
can determine the impact relative to other parameters.


A. INTRODUCTION


A graphical analysis is presented using the Draftsman
display on data collected concerning selected Naval
contracts signed during the period 1949 through 1963. This
chapter explores the general descriptive qualities of eleven
categories of contractual parameters relative to the
performance of the contracts.

The data originally was analysed in a thesis completed
in 1973 [Ref. 7], through regression and analysis of
variance techniques. Significant in this study was the
conclusion that there was no clear method of describing the
relationships between contract parameters and the subsequent
performance of the contracts. It is this author's opinion
that the analysis failed because the use of linear
regression alone is not sufficient to adequately describe
the relationships present in the data. The analysis
presented, based upon a Draftsman display, suggests that
this method of exploratory data analysis reveals that a
variety of relationships do exist describing contract
performance relative to the contractual parameters.


B. THE CONTRACTS DATA


The data consist of 177 contracts which comprise the
Naval aircraft and missile fixed-price incentive contracts
completed during the period 1949 through 1963. The data as
provided by the Naval Material Command encompasses 11
parameters as follows:

1. Deviation from target cost (percent).
2. Months to complete contract (months).
3. Target profit of manufacturer (percent).
4. Sharing ratio (percent).
5. Ceiling price (percent of target price).
6. Target cost of contract (millions of dollars).
7. Number of items produced in the contract.
8. Number of contracts let that year.
9. Year the contract was signed (see table III).
10. Contractor awarded contract (see table III).
11. Type of system (see table III).


TABLE III
Description of Variable Coding

Codes for variable 9        Codes for variable 11
YEAR SIGNED                 SYSTEM TYPE
 1 = 1949                   1  Utility Airplane
 2 = 1950                   2  Combat Airplane
   .                        3  Missile
   .                        4  Blimp
15 = 1963                   5  Helicopter
                            6  Drone
                            7  Airborne Equipment

Codes for variable 10
MANUFACTURER
 1  Beech      7  Hiller        13  Ryan        19  Philco
 2  LTV        8  Kaman         14  Sikorsky    20  Maxson
 3  Convair    9  Martin        15  Bell        21  Northrop
 4  Douglas   10  McDonnell     16  Lockheed    22  Raytheon
 5  Boeing    11  N. American   17  Bendix      23  Aerojet
 6  Grumman   12  Vertol        18  Gen Elect.


C. THEORY OF FIXED-PRICE INCENTIVE CONTRACTS


The concept behind fixed-price incentive (FPI) contracts
is that they are intended to be used in the development,
management support, and production of items in which the
uncertainty of cost is too great to allow a firm fixed-price
(FFP) contract attractive to bidders. In theory, the FPI
contract should motivate control of costs by rewarding the
manufacturer with a greater profit level as costs are
reduced below the negotiated target cost.

The incentive feature of the FPI contract should
influence the contractors to effectively manage cost
associated decisions in a manner beneficial to profit. This
in turn should result in a favorable cost outcome to the
government as well. This mutually favorable outcome is
communicated in the form of the sharing ratio which
establishes the amount of money which will be returned to
the contractor for every dollar saved below the target cost
of the program. For example, a 75/25 sharing ratio returns
25% of every dollar saved to the contractor while reducing
the government's expected cost by .75 dollars. The higher
the percent returned, the greater the potential for gain or
loss to the contractor. Hence, lower sharing ratios reflect
a greater degree of financial risk to the contractor.

The ceiling price of a contract is a control measure to
avoid excessive cost overruns to the government. The
ceiling price establishes the maximum amount of cost which
will be paid by the government. When final cost exceeds the
ceiling cost, the difference must be borne out of pocket by
the contractor as a loss. Cost outcomes which fall between
the negotiated target and ceiling values result in a break
even venture to the manufacturer.
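
The arithmetic of the sharing ratio and ceiling price can be
sketched as follows. The sketch uses the customary settlement
rule for contracts of this kind (final profit equals target
profit plus the contractor's share of any underrun, with the
price paid capped at the ceiling). The thesis does not state
this formula, so the rule is an assumption, as are the
function name and the dollar figures (in millions). Sharing
overruns at the same ratio as underruns is a further
simplification.

```python
def fpi_settlement(target_cost, target_profit, ceiling_price,
                   final_cost, contractor_share):
    """Return (price paid by government, contractor profit).

    Assumed rule: profit is adjusted by the contractor's share of the
    difference between target and final cost, and the price paid is
    never allowed to exceed the ceiling price.
    """
    adjustment = contractor_share * (target_cost - final_cost)
    price = min(final_cost + target_profit + adjustment, ceiling_price)
    return price, price - final_cost


# Hypothetical 75/25 contract: target cost 100, target profit 9, ceiling 120.
price, profit = fpi_settlement(100.0, 9.0, 120.0, 90.0, 0.25)
print(price, profit)

# Hypothetical overrun past the ceiling: the contractor absorbs the excess.
capped_price, capped_profit = fpi_settlement(100.0, 9.0, 120.0, 130.0, 0.25)
print(capped_price, capped_profit)
```

In the first case a 10 million dollar underrun raises the
contractor's profit by 2.5 and lowers the government's price
by 7.5, which matches the 75/25 example in the text. In the
second case the ceiling binds and the contractor takes a loss.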

D. PRELIMINARY ANALYSIS

1. General 


A Draftsman display of the eleven contractual parameters
was generated for preliminary analysis. Most noticeable was
that nine of the eleven variables consisted of discrete
values which resulted in substantial overlapping of plotted
points throughout the display. The plots containing the
parameter of number of items produced indicate a problem in
scaling. Contracts range in size from 40 to 1400 items. This
problem, as shown in figure 6.1, is caused by the eight
extreme outlier contracts containing in excess of 800 items.
The compression of the remaining majority of the contracts
into a very small segment of the plots would prevent
observations of any meaningful value.

Figure 6.1 Number of Items per Contract.


A subsequent Draftsman display was generated to
alleviate the problem of overlap as well as the scaling of
number of items per contract. To take care of the overlap
problem all variables except deviation from target cost,
target cost, number of items, and manufacturer were
jittered. To take care of the scaling problem, a log
transformation was used on the variable of number of items.
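
The effect of the log transformation on scaling can be seen
in a small sketch. It computes the fraction of the axis
length at which a given contract size would be plotted, with
and without the transformation, over the 40 to 1400 item
range quoted above. The function name, the intermediate item
counts, and the choice of base 10 logarithms are assumptions
made for illustration; the thesis does not state which base
was used.

```python
import math


def axis_fraction(value, low, high, transform=lambda v: v):
    """Position of `value` along an axis running from `low` to `high` (0 to 1)."""
    t = transform
    return (t(value) - t(low)) / (t(high) - t(low))


low, high = 40, 1400  # range of items per contract quoted in the text

# Hypothetical contract sizes used only to show the change in spacing.
for items in (40, 100, 200, 800, 1400):
    linear = axis_fraction(items, low, high)
    logged = axis_fraction(items, low, high, math.log10)
    print(items, round(linear, 3), round(logged, 3))
```

On the linear axis a 200 item contract sits within the first
eighth of the axis, while on the logarithmic axis it falls
near the middle, so the smaller contracts are no longer
compressed against the origin.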

The Draftsman segments generated with enhancements
were reduced and are shown in figures 6.2 through 6.5. For
convenience and clarity of discussion appropriate plots will
be reproduced within the body of the chapter text. The
original display segments may be seen in Appendix C.

Figure 6.5 Draftsman Segment 4.


2. Characteristics of Size


a. Volume of Contracts 


As a preliminary expression of size, the number
of contracts per year provides an estimate for the volume of
contracts let during a particular period of time. We might
suspect that a low volume of contracts would create a more
competitive atmosphere among manufacturers as they attempt
to maintain their facilities in a production mode. A high
volume of contracts let offers a greater opportunity for
manufacturers to select contracts in which they have the
greatest amount of expertise and experience. The latter
case has a greater potential for controlling costs as well
as manufacturer profits. Thus, we might expect that as the
number of contracts let increases, the positive deviations
from target cost should decrease.

The plot of cost deviation versus number of 
contracts seen in the first plot of figure 6.6 does appear 
to generally support this hypothesis. As the volume of 
contracts increases there is a tendency for cost deviation 
to be negative. In fact, the greater the volume, the 
greater in magnitude the negative cost deviation. 

The cost deviation versus volume relationship,
when compared over time, also suggests that the time at
which the contracts were signed may have additional bearing
(see figure 6.6). The rapid increase in contract volume from
1949 to 1951 is characterized by large absolute deviation
from target cost (though generally negative). As volume
declined from 1951 through 1955, the absolute deviations
from target cost can be seen to be much smaller and roughly
equally distributed between positive and negative. The
subsequent volume increase experienced from 1955 to 1958
also shows an increase in the absolute deviations from
target cost (with a fair tendency toward negative
deviations). The later decline from 1958 to 1963 is somewhat
more difficult to interpret due to the relatively low volume
of contracts. Absolute deviations do appear to become
slightly smaller as volume decreases. However, when
contrasted to previous periods of similar volume, the
deviations, while becoming smaller, appear to be doing so to
a lesser degree than previously. The last three years of the
data period, while characterized both by low volume as well
as a small absolute deviation from target costs, clearly
show a tendency towards positive cost deviation.
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Figure 6.6 Contract Volume. 


b. Contract Duration


Based upon production management techniques, we
might expect that the duration of a contract would have a
relationship to contract performance. Short term contracts
leave little time for management to adjust production
activities to maximize the efficiency of operations. As
contract duration increases, a greater opportunity is
afforded to contractors to learn by early production errors
and make the cost related decisions necessary for control.
When contract duration extends far into the future,
difficulties can arise by external economic influences which
could not be accurately forecast at the onset.
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Figure 6.7 Contract Duration and Performance. 


The isolated plot of deviations from cost relative
to contract duration reproduced in figure 6.7 reveals
some interesting features. The contracts of less than 40
months duration exhibit widely dispersed deviations from
target cost. For the contracts which lasted between 40 and
70 months the cost deviations exhibit a clear trend toward
the negative side. Contracts which exceed 70 months in
duration show an increased deviation that is roughly equally
split between positive and negative.
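
The three duration ranges read from the plot can also be
expressed as a simple grouping. The sketch below assigns each
contract to a band using the 40 and 70 month boundaries from
the text and summarizes the cost deviations in each band. The
band names, function names, and the sample contracts are
hypothetical and are not taken from the thesis data.

```python
from statistics import mean


def duration_band(months):
    """Classify a contract by duration using the 40 and 70 month boundaries."""
    if months < 40:
        return "short"
    if months <= 70:
        return "medium"
    return "long"


def summarize(contracts):
    """Map each band to (mean, minimum, maximum) percent cost deviation."""
    bands = {}
    for months, deviation in contracts:
        bands.setdefault(duration_band(months), []).append(deviation)
    return {band: (mean(devs), min(devs), max(devs))
            for band, devs in bands.items()}


# Hypothetical (months, percent deviation from target cost) pairs.
contracts = [(12, 8.0), (25, -11.0), (38, 3.0), (45, -4.0),
             (60, -6.5), (68, -2.0), (75, 9.0), (90, -10.0)]

summary = summarize(contracts)
for band in ("short", "medium", "long"):
    print(band, summary[band])
```

A summary of this kind only restates what the scatter plot
shows; its use is to make the visual impression of each band
easy to compare across subsets of the data.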

The cost deviation characteristics relative to
duration noted above appear to hold irrespective of the year
in which the contracts were signed (see figure 6.7).
Contracts of less than 40 months duration as well as those
between 40 and 70 months are fairly equally distributed
over all years of the data period. Whether the contracts
which exceed 70 months in duration are affected by the year
signed is not determinable since all but one of these
contracts occurred during the first five years.

c. Target Cost of Contracts


The target cost parameter provides a measure of
the financial size of the contracts let. The plot of this
variable with respect to deviation from target cost is shown
in figure 6.8. The greatest absolute deviation from target
cost can be observed when target cost is less than 100
million dollars. In this region, the deviations tend to be
negative but only by a slight numerical margin. The
contracts which exceeded a target cost of 100 million
dollars are clearly seen to exhibit a smaller absolute
deviation from the target cost. These contracts are further
characterized by generally favoring a negative cost
deviation.







Figure 6.8 Target Cost and Performance. 


3. Incentive Measures


a. Sharing Ratio


The sharing ratio establishes the amount of
money which will be returned to the contractor for every
dollar saved in cost. This return reflects the potential
for profit to the manufacturer. The higher the sharing
ratio, the higher the potential gain. This relationship
appears to be echoed in the plot of these two variables as
seen in figure 6.9. As the sharing ratio increases so does
the relative expected profit level of the manufacturer. The
risk factor associated with contract duration can also be
observed in the sharing ratio. As contract duration
increases the potential for influence by other external
economic parameters can less accurately be forecasted. The
general decline of sharing ratios as contract duration
increases can be seen in figure 6.9. This decline may be a
sign of the contractor's willingness to accept a lower
marginal profit position in order to decrease risk.
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Figure 6.9 The Incentive of Sharing Ratios. 


The most striking observation about the sharing
ratio is that when evaluated with performance, little if any
relationship can be determined. We can see no clear
indication that any particular ratio can be associated with
a favorable (negative) cost deviation. This lack of
relationship is significant in that the sharing ratio is
supposedly a major incentive feature of fixed-price
incentive contracts. A determination that sharing ratios are
an insignificant parameter suggests that this method of
contracting might warrant further analysis by the
government. It may be interesting to note that a Rand study
of over [illegible] Air Force contracts resulted in a similar
finding that the sharing ratio was insignificant with
respect to the final outcome of contracts [Ref. 8 p.38].

b. Target Profit


The plots of negotiated target profit level in
figure 6.10 reveal similar characteristics when compared to
the contract size parameters. Very evident is that target
profit in general tends to revolve around the 9% level.
Given the time period during which these contracts were
performed, 9% represents a rather lucrative profit level.


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Figure 6.10 Target Profit and Size. 


Profit versus target cost is observed to be
funnel shaped with the greatest deviations in profit at the
lower end of target cost. As target cost increases the
variance of profit becomes smaller while stabilizing about
the 9% level (figure 6.10). A similar though looser
relationship exists with the number of items produced.
Target profit deviates imperceptibly more initially and
generally tends to fluctuate about 2 percentage points above
and below the 9% level. The 9% profit level remains firm
regardless of the number of items.

Target profit plotted with duration and volume
shows a slight decline in expected profit levels as either
duration or volume increases, as depicted in figure 6.10. In
the former situation this may be a willingness to trade off
a lower profit level for the security of a longer term
production operation. The inverse relationship between
profit and volume suggests that during low volume periods
contractors are attempting to make the most out of the few
contracts available. Conversely when contract volume is
high the expected profit level declines to about 9%. The
conclusion might be drawn that with more contracts
available, the contractors are willing to accept a slightly
lower profit level per contract since the opportunity is
greater to win multiple contracts during high volume
periods.

As a measure of performance the target profit
may lack significance in determining the deviation from
target cost. The plot of target profit versus deviation
from target cost shown in figure 6.11 does not suggest a
describable relationship. The majority of the manufacturers
tended to negotiate about a 9% profit level. An analysis of
the eight most deviant outliers from this characteristic
reveals that seven of these outliers were by manufacturers
with these contracts being their sole participation during
the entire 15 year period. The comparison of target profit,
deviation from target cost and system type is also
significant with respect to the outliers. While there is
nothing notable about their target cost, the eleven most
unfavorable cost outcomes (positive cost deviations)
correspond solely to three system types. These are combat
aircraft, missiles, and helicopters, denoted by item types
2, 3, and 5 respectively in figure 6.11.
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Figure 6.11 Target Profit and Performance. 


E. PRELIMINARY CONCLUSIONS


The analysis and discussion presented reveals that an
abundant amount of information is visible in the Draftsman
display of the contract data. Further, there are
indications that there are relationships between the
contractual parameters and contractor performance. Briefly
summarized these are:

1. As the volume of contracts let increases there is a
tendency for the contracts to result in a negative
cost deviation (favorable to the government).

2. Over the 15 year period as volume changed from year
to year there appears to be a related reaction
relative to contractor performance. Periods denoted by
an increasing volume are reflected with an increase
in cost deviations. When volume declines a related
decline in cost deviations can also be observed but
at a more cautious rate.

3. Contract duration as related to cost deviation might
better be described in terms of short, medium, and
long term contracts. The most favorable contract
duration appears to be between 40 and 70 months.
This relationship is fairly consistent regardless of
the year in which the contracts were signed or the
volume of contracts let.

4. Fluctuations in cost deviations tend to stabilize
when the contracts contain more than 50 items.

5. As the target cost increases there is a greater
tendency for negative cost deviation to occur.
Contracts in excess of 100 million dollars in
particular resulted in predominantly favorable outcomes.

6. The sharing ratio as a traditional incentive measure
of a FPI contract may be suspect. No relationship
can be observed between this parameter and
performance.

7. No obvious relationship can be noted between target
profit levels and contract performance.

8. The ultra high technology systems of combat aircraft,
helicopters and missiles exhibit the greatest
potential for adverse performance.


F. ADDITIONAL CONFIRMATORY ANALYSIS


1. General 


The preliminary analysis using a single iteration of
the Draftsman display revealed a variety of interesting
relationships between contractual parameters. Certainly
other relationships exist which have not been discussed. As
an exploratory data analysis tool the Draftsman display
enables the user to look at the data at almost any level of
detail desired. Subsequent displays can be generated on
various subpopulations, such as each of the manufacturers,
to gain greater insight into their performance behavior. It
is this versatility in exploring data sets which enables the
user to rapidly process large amounts of data in order to
gain a feeling for the interactions involved.

The use of the Draftsman display can also assist the
user in the application of more formal statistical
approaches. The use of such techniques as regression
analysis provides a confirmatory measure to the exploratory
indications viewed in the display.

The blind application of some statistical packages
without first looking at the data can result in erroneous or
misleading conclusions. This is particularly true of large
data sets where misplaced decimal points, formatting errors
and other related problems may not be easy to detect. The
visual nature of the Draftsman display can assist in
identifying these problems as well as aid in selecting
appropriate variables on which to initiate formal analysis.


2. Cost Deviation Over Time


One major question which cannot be readily answered
with the Draftsman display of contract data is the
relationship of cost deviations over the entire contract period.
The scatter plot of percent cost deviation versus year
signed reveals a wide dispersal in cost deviations with no
clear visual trend apparent (figure 6.12).

A least squares linear model was selected in order
to determine if for all manufacturers a trend exists
relating cost deviations to the year in which contracts were
signed. The results of this indicate that in fact an
upward trend in cost deviations did occur from 1949 through
[illegible] (figure 6.12). The computed t-value of 3.3 is quite
significant and indicates that the probability that the
value of the coefficient B(1) was actually zero is much less
than .05. The slope of the regression line is .580 with the
lower and upper confidence limits .232 and .927
respectively. This also clearly supports the upward trend of cost
deviations.
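
The slope, t-value and confidence limits quoted for a straight
line fit follow from the standard least squares formulas. The
sketch below is a minimal illustration in Python and is not the
program used in this thesis. The function name linear_trend, the
sample data and the critical value passed as t_crit are
assumptions made for the example only.

```python
import math


def linear_trend(x, y, t_crit):
    """Fit y = b0 + b1*x by least squares and summarize the slope.

    t_crit is the tabled two-sided critical t value for n - 2
    degrees of freedom; it is supplied by the caller.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Residual variance uses n - 2 degrees of freedom (two fitted terms).
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)
    return {
        "b0": b0,
        "b1": b1,
        "se_b1": se_b1,
        "t": b1 / se_b1,
        "lower": b1 - t_crit * se_b1,
        "upper": b1 + t_crit * se_b1,
    }


# Hypothetical data: coded year signed and percent cost deviation.
years = [1, 2, 3, 4, 5, 6]
deviation = [2, 1, 4, 3, 6, 5]
# 2.776 is the tabled t value at the 0.05 level for 4 degrees of freedom.
result = linear_trend(years, deviation, t_crit=2.776)
print(result)
```

As a rough check on the figures reported above, a slope of .580
with a t-value of 3.3 implies a standard error of about .176,
and the limits .232 and .927 lie about .348 on either side of
the slope, or roughly two standard errors, which is what would
be expected at the .05 level.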
Figure 6.12 Contract Performance Over Time.


While an overall increasing trend in cost deviations
is evident, the performance of the individual manufacturers
involved might reasonably be expected to differ. The
generation of Draftsman displays for each manufacturer would
provide the starting point for comparison. While the entire
displays are not presented, the scatter plots of two major
contractors (Grumman and Lockheed) indicate how different
performance results relate in the general cost deviation
picture.
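
Comparing manufacturers in this way amounts to fitting the same
trend separately within each subgroup rather than to the pooled
data. The following Python sketch shows the idea under stated
assumptions: the record layout (manufacturer, year, deviation),
the function names and the data values are hypothetical and are
not taken from the contract data set.

```python
from collections import defaultdict


def fit_slope(x, y):
    """Least squares slope of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def slopes_by_group(records):
    """Return {manufacturer: slope} from (manufacturer, year, deviation) records."""
    groups = defaultdict(lambda: ([], []))
    for name, year, deviation in records:
        groups[name][0].append(year)
        groups[name][1].append(deviation)
    return {name: fit_slope(xs, ys) for name, (xs, ys) in groups.items()}


# Hypothetical records for two firms, labeled A and B.
records = [
    ("A", 1, 1.0), ("A", 2, 3.0), ("A", 3, 5.0),
    ("B", 1, 4.0), ("B", 2, 3.0), ("B", 3, 2.0),
]
pooled = fit_slope([r[1] for r in records], [r[2] for r in records])
per_firm = slopes_by_group(records)
print(pooled, per_firm)
```

In this made-up example the pooled slope is mildly positive even
though one firm rises steeply and the other declines, which is
the kind of difference a single overall trend line can hide.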

In applying the linear regression model to Grumman
as seen in figure 6.13, cost deviations rose rapidly over
the time period. This rise is much faster than that seen in
figure 6.12 for all the firms in general. From a government
perspective this might suggest that a closer scrutiny of
this company's activities might be warranted.

Figure 6.13 Grumman Contract Performance.

The application of the regression model to Lockheed
as seen in figure 6.14 presents a very different picture.
In this instance a cubic fit rather than a straight line is
more appropriate in describing the relationship of cost
deviations over time. Cost deviations appear almost
cyclical. Particularly noteworthy is the difference between
Grumman and Lockheed during the last four years of the data
period. Grumman cost deviations continue to rise while
Lockheed experienced a sharp decline in deviations. Quite
likely there are external considerations which are
influencing cost for each of the manufacturers.
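
One way to judge whether a cubic describes a series better than
a straight line is to fit both by least squares and compare the
residual sums of squares. The Python sketch below does this by
solving the normal equations directly. It is an illustration
only: the function names and the data are hypothetical, and the
years are centered before fitting as a numerical convenience
that is assumed here rather than taken from the thesis program.

```python
def fit_polynomial(x, y, degree):
    """Least squares polynomial fit; returns [b0, b1, ..., b_degree]."""
    m = degree + 1
    # Normal equations (X'X) b = X'y for columns 1, x, x^2, ...
    a = [[sum(xi ** (r + c) for xi in x) for c in range(m)] for r in range(m)]
    rhs = [sum((xi ** r) * yi for xi, yi in zip(x, y)) for r in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
            rhs[r] -= factor * rhs[col]
    # Back substitution.
    coef = [0.0] * m
    for r in range(m - 1, -1, -1):
        known = sum(a[r][c] * coef[c] for c in range(r + 1, m))
        coef[r] = (rhs[r] - known) / a[r][r]
    return coef


def residual_ss(x, y, coef):
    """Sum of squared residuals for a fitted polynomial."""
    return sum(
        (yi - sum(b * xi ** k for k, b in enumerate(coef))) ** 2
        for xi, yi in zip(x, y)
    )


# Hypothetical series that rises, falls and turns again over 12 coded years.
years = list(range(1, 13))
deviation = [5, 9, 14, 16, 15, 13, 12, 13, 15, 14, 9, 2]
center = sum(years) / len(years)
u = [yr - center for yr in years]  # centering keeps the equations well conditioned

line = fit_polynomial(u, deviation, 1)
cubic = fit_polynomial(u, deviation, 3)
print(residual_ss(u, deviation, line), residual_ss(u, deviation, cubic))
```

A markedly smaller residual sum of squares for the cubic, together
with t-values for the higher order terms that exceed the tabled
critical value, supports using the cubic over the straight line.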


          COEFFICIENT               B/SIGMA(B)   CONFIDENCE INTERVAL
TERM      B            SIGMA(B)     t            LOWER      UPPER
B0        13.9193      6.802         2.045         -.075    27.898
B1        [illegible]  3.629        -2.950       -18.293    [illegible]
B2         1.837        .5356        3.4359         .714     3.001
B3         -.089       [illegible]  -3.469         -.141     -.037

THE THEORETICAL VALUE FOR T AT THE 0.05 LEVEL AND 26 DF = 2.056
[Plot: LOCKHEED; horizontal axis YEAR SIGNED]
Figure 6.14 Lockheed Contract Performance.